Skip to content

Conversation

@friendlymatthew
Copy link
Contributor

Which issue does this PR close?

Rationale for this change

This PR implements comparison logic for Union arrays against opaque (non-union) arrays in the ord kernel

When comparing a union to an opaque array, the comparison first checks if the union's type id matches the opaque array's type, then compares the actual values if types align. This provides a consistent ordering where union variants are ordered by their type id relative to the opaque type

Notes about this PR

Please review from the second commit. The first commit being #8838

@github-actions github-actions bot added the arrow Changes to the arrow crate label Nov 20, 2025
@friendlymatthew friendlymatthew force-pushed the friendlymatthew/union-to-opaque branch from 5f9ca9b to 35ec44d Compare November 20, 2025 20:19
@friendlymatthew
Copy link
Contributor Author

cc @alamb (sorry for the constant pinging I feel)

@alamb
Copy link
Contributor

alamb commented Nov 21, 2025

No problem @friendlymatthew -- I'll put it on my review queue

I am currently putting my arrow-rs focus on getting the release out. I'll probably have more time for reviews next week

It looks like there are several CI failures with this PR

@friendlymatthew friendlymatthew force-pushed the friendlymatthew/union-to-opaque branch 2 times, most recently from e8a2b1e to 9b30e9a Compare November 21, 2025 15:21
@alamb
Copy link
Contributor

alamb commented Dec 3, 2025

No problem for the pings -- sorry for the delay -- I was focusing on getting the arrow releases out and then vacation/picking up DataFusion reviews. I now have some more bandwidth for arrow-r

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this PR @friendlymatthew -- I am not sure about this one. Sorry if I mislead you earlier

unreachable!()
};

let opaque_type_id = union_fields
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about this -- it seems like it will compare the union array to the first field that has the same type.

What if the union array has multiple repeated types (I realize that is not a common use case, but I think it is possible)

},
(Map(_, _), Map(_, _)) => compare_map(left, right, opts),
(Union(_, _), Union(_, _)) => compare_union(left, right, opts),
(Union(_, _), _) => compare_union_to_opaque(left, right, opts),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any other example of comparing one array to an array of another type? I think this kernel expects the types to match exactly

I do see the argument about how UnionArray could be treated as a special case (as some sort of runtime typed thing) 🤔 but I am not sure UnionArrays are required to behave that way

I wrote up another way that your usecase might be handled

Let me know what you think

@friendlymatthew friendlymatthew marked this pull request as draft December 8, 2025 14:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrow Changes to the arrow crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Extend comparison support for Union arrays against an opaque array

2 participants